Formant-based frequency warping for improving speaker adaptation in HMM TTS

Authors

  • Xin Zhuang
  • Yao Qian
  • Frank K. Soong
  • Yi-Jian Wu
  • Bo Zhang
Abstract

Vocal Tract Length Normalization (VTLN), usually implemented as a frequency warping procedure (e.g. a bilinear transformation), has been used successfully in speech recognition to adapt spectral characteristics to a target speaker. In this study we exploit the same concept of frequency warping but concentrate explicitly on mapping the first four formant frequencies of 5 long vowels between the source and target speakers. A universal warping function is thus constructed to improve MLLR-based speaker adaptation in TTS. The function first warps the frequency scale of the source speaker's speech data toward that of the target speaker, and an HMM is trained on the warped features. MLLR-based speaker adaptation is then applied to the trained HMM to synthesize the target speaker's speech. When tested on a database of 4,000 sentences (source speaker) and 100 sentences each from a male and a female speaker (target speakers), formant-based frequency warping was found to be very effective in reducing the objective log spectral distortion relative to the system without formant frequency warping. The improvement is also confirmed subjectively in AB preference and ABX speaker similarity listening tests.
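The warping idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the functional form of the universal warping function, so the piecewise-linear interpolation between formant anchor points used here is an assumption, as are the function names `build_formant_warp` and `bilinear_warp`, the 8 kHz upper band edge, and the formant values in the usage lines, which are invented for illustration and are not data from the paper. The bilinear (all-pass) warp commonly used for VTLN is included only for comparison.

```python
import math
from bisect import bisect_right


def build_formant_warp(src_formants, tgt_formants, nyquist=8000.0):
    """Return a piecewise-linear map from source to target frequency (Hz).

    Anchor points are (0, 0), one (source, target) pair per formant,
    and (nyquist, nyquist). This functional form is an assumption.
    """
    if len(src_formants) != len(tgt_formants):
        raise ValueError("need one target formant per source formant")
    xs = [0.0] + sorted(src_formants) + [nyquist]
    ys = [0.0] + sorted(tgt_formants) + [nyquist]
    # Anchors must be strictly increasing so the map is monotonic.
    for seq in (xs, ys):
        if any(b <= a for a, b in zip(seq, seq[1:])):
            raise ValueError("anchor frequencies must be strictly increasing")

    def warp(f):
        f = min(max(f, 0.0), nyquist)  # clamp to the valid band
        i = min(bisect_right(xs, f) - 1, len(xs) - 2)  # segment index
        t = (f - xs[i]) / (xs[i + 1] - xs[i])  # position inside segment
        return ys[i] + t * (ys[i + 1] - ys[i])

    return warp


def bilinear_warp(omega, alpha):
    """All-pass (bilinear) warp of a normalized frequency in [0, pi]."""
    return omega + 2.0 * math.atan(
        alpha * math.sin(omega) / (1.0 - alpha * math.cos(omega))
    )


# Hypothetical average formants (Hz) for a source and a target speaker.
warp = build_formant_warp([500.0, 1500.0, 2500.0, 3500.0],
                          [550.0, 1650.0, 2700.0, 3700.0])
warped_f1 = warp(500.0)    # an anchor maps exactly onto its target
warped_mid = warp(1000.0)  # halfway between the F1 and F2 anchors
```

Unlike the single-parameter bilinear warp, a map anchored at several formants can stretch some frequency regions and compress others, which is the extra flexibility that formant-based warping relies on.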

Similar resources

Framework Of Feature Based Adaptation For Statistical Speech Synthesis And Recognition

The advent of statistical parametric speech synthesis has paved new ways to a unified framework for hidden Markov model (HMM) based text to speech synthesis (TTS) and automatic speech recognition (ASR). The techniques and advancements made in the field of ASR can now be adopted in the domain of synthesis. Speaker adaptation is a well-advanced topic in the area of ASR, where the adaptation data ...

Frequency Warping for Speaker Adaptation in HMM-based Speech Synthesis

Speaker adaptation in speech synthesis transforms a source utterance to a target utterance that differs from the source in terms of voice characteristics. In this paper, we employ vocal tract length normalization, which is generally used in speech recognition to remove individual speaker characteristics, for speaker adaptation in speech synthesis. We propose a frequency warping approach based on...

Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis

In the EMIME project, we are developing a mobile device that performs personalized speech-to-speech translation such that a user's spoken input in one language is used to produce spoken output in another language, while continuing to sound like the user's voice. We integrate two techniques, unsupervised adaptation for HMM-based TTS using a word-based large-vocabulary continuous speech recognizer...

Speaker adaptation using only vocalic segments via frequency warping

Speaker adaptation techniques allow hidden Markov model (HMM) based speech synthesis systems to mimic a target voice of which a few samples are available. However, usual adaptation approaches are not applicable when the target voice is dysarthric, i.e. the target speaker has an impairment which prevents the correct pronunciation of some phonemes. As a first step towards giving personalized synt...

Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...

Publication date: 2010